feat(worker): pass db_session to ProcessingState in processor#744
feat(worker): pass db_session to ProcessingState in processor#744thomasrockhu-codecov wants to merge 3 commits intotomhu/processing-state-db-pathfrom
Conversation
d2d6c17 to
7e53294
Compare
6ce1a54 to
9c656c6
Compare
7e53294 to
010a532
Compare
9c656c6 to
e0b3bdf
Compare
010a532 to
a5d77c0
Compare
700f40b to
39bd0c9
Compare
e0b3bdf to
696d952
Compare
39bd0c9 to
f43f178
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Activate the DB-backed state path in process_upload() by passing db_session to ProcessingState. The processor now writes PROCESSED state to the database instead of Redis. Also removes the should_trigger_postprocessing check and direct finisher triggering from the processor -- this orphaned-task recovery will be replaced by the gate key mechanism in a later PR. Made-with: Cursor
696d952 to
3b30446
Compare
f43f178 to
1361510
Compare
The DB-backed path was skipping Redis writes, but the finisher still reads from Redis. Keep writing to both until the finisher migrates to DB-backed state in a later PR. Made-with: Cursor
1361510 to
81eb7d9
Compare
| def mark_uploads_as_processing(self, upload_ids: list[int]): | ||
| if not upload_ids or self._db_session: | ||
| if not upload_ids: | ||
| return |
There was a problem hiding this comment.
Bug: Removing the self._db_session guard in mark_uploads_as_processing causes task retries to corrupt Redis state, preventing the finisher task from being triggered.
Severity: CRITICAL
Suggested Fix
Restore the self._db_session check in the guard condition within mark_uploads_as_processing. The line should be if not upload_ids or self._db_session: return. This ensures the function is a no-op for Redis when a database session is provided, preventing state corruption during retries.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.
Location: apps/worker/services/processing/state.py#L130-L132
Potential issue: In a task retry scenario where a task fails after an upload is marked
as processed, the `mark_uploads_as_processing` function is called again. Due to the
removal of a check for `self._db_session`, this function re-adds the already processed
upload ID to the Redis "processing" set. This results in the upload ID existing in both
the "processing" and "processed" Redis sets. Consequently, a subsequent safety check,
`should_trigger_postprocessing`, incorrectly determines that processing is not complete,
which prevents the finisher task from running and leaves the commit in a stuck state.
There was a problem hiding this comment.
Pre-existing behavior identical to main branch. mark_uploads_as_processing always wrote to Redis on main (no db_session guard existed). This PR restores that same behavior for dual-write. Not a new regression.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## tomhu/processing-state-db-path #744 +/- ##
==================================================================
- Coverage 92.25% 92.24% -0.01%
==================================================================
Files 1304 1304
Lines 47949 47950 +1
Branches 1628 1628
==================================================================
Hits 44233 44233
- Misses 3407 3408 +1
Partials 309 309
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
When db_session is present, both the DB path and the Redis fall-through path were incrementing CLEARED_UPLOADS. Move the Redis srem inside the DB block and return early so the metric is only counted once. Made-with: Cursor
Summary
db_sessiontoProcessingStateinprocess_upload(), activating the DB-backed state path added in PR 2a. The processor now writesstate_id = PROCESSEDto the database instead of Redis.should_trigger_postprocessingcheck and direct finisher triggering from the processor. This orphaned-task recovery mechanism will be replaced by the gate key in PR 4a.celery_app,upload_finisher_task_name).Stacked on #742.
Test plan
test_successful_processing_marks_upload_as_processedverifies DB state transitionMade with Cursor
Note
Medium Risk
Touches upload processing state transitions and dual DB/Redis bookkeeping; mistakes could miscount uploads or trigger post-processing incorrectly, but changes are localized and covered by an updated test.
Overview
The processor now constructs
ProcessingStatewithdb_session, so successful processing persistsUpload.state_id = PROCESSEDin the DB while still dual-writing Redis sets for tracking.ProcessingStatetightens Redis cleanup in DB-backed paths (e.g., alwayssremfrom theprocessingset on clear, and continues Redis moves onmark_upload_as_processedeven when DB is updated).Tests drop the mock-heavy orphaned-finisher scenarios and replace them with a single assertion that a successful
process_uploadupdates the upload’s DBstate_idtoPROCESSED(without changing the stringstate).Written by Cursor Bugbot for commit b0374a1. This will update automatically on new commits. Configure here.